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Abstract: Part I presented a Monte Carlo Bayesian method for constraining a complex statistical model of GCM sub-gridcolumn 
moisture variability using high-resolution MODIS cloud data, thereby permitting large-scale model parameter estimation and cloud 
data assimilation. This part performs some basic testing of this new approach, verifying that it does indeed significantly reduce 
mean and standard deviation biases with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud 
top pressure, and that it also improves the simulated rotational-Ramman scattering cloud optical centroid pressure (OCP) against 
independent (non-assimilated) retrievals from the OMI instrument. Of particular interest, the Monte Carlo method does show skill 
in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data 
assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the 
Monte Carlo approach allows finite jumps into regions of non-zero cloud probability. In the example provided, the method is able 
to restore marine stratocumulus near the Californian coast where the background state has a clear swath. This paper also examines 
a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. 
One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in the 
cloud observables on cloud vertical structure, beyond cloud top pressure and optical thickness, thus necessitating strong dependence 
on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification due to Riishojgaard 
(1998) provides some help in this respect, by better honoring inversion structures in the background state. Copyright © 2013 Royal 
Meteorological Society 
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1 Introduction 

In the introduction to Part 1 we provided motivation for 
this study by discussing a number of the difficulties asso- 
ciated with the subject of cloud data assimilation (CDA). 
We particularly emphasized problems associated with the 
mismatch between the frequently small scales of cloud 
variability and typical GCM gridcolumn footprints and 
with the strong non-linearities present in cloud processes. 
One of the key problems is that a subsaturated back- 
ground state cannot produce clouds via any small equi- 
librium perturbation to moisture. We then provided a 
detailed description of a new Monte Carlo Bayesian CDA 
approach designed to address these problems, with the 
goal of improving poor model background states to the 
point where more traditional CDA approaches are able to 
perform more favorably with them. 

This Part II now discusses the application of the new 
method and its performance in a number of case studies 
and sensitivity tests. Section 2 sets up a control cloud data 
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assimilation experiment with the new method, making a 
few necessary modifications to the system described in 
Part 1. It is verifed that the control does indeed signif- 
icantly reduce mean and standard deviation biases with 
respect to the assimilated MODIS cloud optical depth, 
brightness temperature and cloud top pressure. Section 3 
examines a number of algorithmic and physical sensitivi- 
ties of the new method and provides guidance for its cost- 
effective implementation. The section also demonstates 
the significant utility of a simple flow-dependent corre- 
lation function modification due to Riishojgaard (1998). 
Section 4 presents a case study involving marine stratocu- 
mulus off the Californian coast. This case is an especially 
difficult test of the new method, because the observations 
show stratocumulus very near and sometimes right up to 
the coast, whilst the background state has a large clear 
swath in this region. The results are very encouraging. 
Section 5 presents an short application of the method to 
assimilation of geostationary cloud retrievals made for the 
SEVIRI instrument aboard the Meteosat-9 platform. The 
method is easily adaptable to assimilation of this data. 
A comparison against the assimilation of contemporane- 
ous polar-orbiting Aqua MODIS data shows comparible 
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results for both types of observations. Section 6 presents 
an independent validation of the new method using a non- 
assimilated dataset, the so-called cloud Optical Centroid 
Pressure (OCP) retrieved from a rotational-Ramman scat- 
tering algorithm applied to UV spectral measurements 
made by the Ozone Monitoring Instrument (OMI) aboard 
the Aura satellite. Finally, Section 7 contains some con- 
clusions. 

2 Control Run 

A number of sensitivity tests have been performed (see 
below) by perturbing key algorithmic and physical param- 
eters of the Bayesian cloud assimilation system. The con- 
trol run for these tests is largely defined by Part I, with the 
following details and modifications: 

(1) N sim = N. mm = 64. 

(2) The number of trials M per point in the MTM 
(Multiple-try Metropolis) chain is a factor / M = 1/2 of 
that specified at the end of Section 2.9 of Part I, namely 
about half the number of effectively independent dimen- 
sions in the parameter space. 

(3) The MTM proposal covariance matrix T q of (17) in 
Part I is amplified by a factor C = 32, since the proported 
advantage of MTM over MH (Metropolis-Hastings) is that 
it allows for larger proposal steps. 

(4) An MTM chain length of n = 200. 

(5) A vertical correlation length scale of L = 100 hPa. 

(6) Despite the considerable success of an assimilation 
run with the above specifications, namely (l)-(5) above, 
an examination of the results revealed excessive cloud 
water path CWP = Jp c dz in the nighttime regions. In 
reprospect this is not surprising — at night there are 
no cloud optical thickness observations to contain the 
vertically integrated condensate in each gridcolumn. The 
remaining observables, T& and p c , both saturate very 
quickly with cloud water path and so cannot constrain 
CWP. To resolve this issue, for nighttime gridcolumns 
only, we multiply the prior term p(a) of Part I by a 
Gaussian-like term 

e -C\[ln(VA B )] 2 

that constrains the ratio of the condensed water path to 
the total water path, A = CWP / TWP , to its triangular- 
ized background state value Xg, where TWP = f (p v + 
p c ) dz. We have found that an exponential constant C\ = 
256 works well. 

(7) On closer examination of the assimilated MODIS 
brightness temperatures, it was found that a few were 
unexpectedly very cold (< 220 K). We had thought such 
high clouds would use p c rather than Tb assimilation, since 
C02-sliced cloud top pressures of p c < 550 hPa are con- 
sidered reliable (see Section 2.6 of Part I). Upon inspec- 
tion, it was found that these particular clouds, although 
reporting a cloud top pressure p c < 550 hPa, were flagged 
by the MODIS algorithm as IR-window rather than CO 2 - 
slicing derived, indicating that the CO 2 -slicing retrieval 
had failed for some reason. This was causing our pixel 
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gridding algorithm, described in Section 2.6 of Part I, to 
choose TJ, rather than p c as the observable. This was, how- 
ever, inconsistent with the forward model, which always 
uses p c as the observable for p c < 550 hPa. As a conse- 
quence, the analysis was reporting extreme negative O-A 
(observation minus analysis) biases in Tb at these loca- 
tions, because the only forward modeled 7), were, by 
definition, for lower clouds. The solution was simple: to 
use p c as an observable for any observation with a p c < 
550 hPa, regardless of whether flagged as IR-window or 
C02-slicing derived, in order to achieve consistency with 
the forward model. This solution eliminated the problem 
with very negative O-B biases in Tb, and the p c biases also 
appear to be fine (see results below). 

With these modifications, the resulting run, denoted 
b7K64fhC32, forms the control for the sensitivity tests 
that follow. The run was made for the day of July 1, 201 1, 
comprising eight three-hour cloud analyses. 

Figure 1 shows global plots of cloud optical thickness 
r, cloud brightness temperature 7/ and cloud top pressure 
Pc for the control run for each of the triangularized back- 
ground (B), observations (O), and result of the Bayesian 
analysis (A). Note that, in general, the analysis is closer 
to the observations than the background. We will be more 
quantitative shortly. 

Note that in all plots and analysis presented in this 
Part II: (1) the use of the term “background” refers 
to the triangularized model background state, namely 
the gridcolumn of skewed triangle PDFs produced by 
the initialization procedure presented in Part I; (2) r 
is an all-sky gridcolumn mean cloud optical thickness 
(zeros included for clear pixels/subcolumns), and 7) and 
p c are in-cloud gridcolumn means of cloud brightness 
temperature and cloud top pressure, respectively; (3) Tb is 
the mean for only the pixels or subcolumns for which Tb is 
actually used, namely when p c > 550 hPa, and conversely 
Pc is the mean for only p c < 550 hPa. r is the mean over 
only daytime gridcolumns. 

Figure 2 shows the corresponding global plots of the 
O-B and O-A biases. Clearly the analysis biases are much 
smaller than the background biases. Figure 3 shows the 
PDFs of O-B and O-A biases in r for the control run. 
The peaks of the O-A PDFs are all significantly narrower 
than for the corresponding O-B PDFs, again indicating 
the basic success of cloud assimilation method. Both the 
modal and the mean biases are also generally smaller in 
magnitude for O-A than O-B. The figure also shows that 
the analysis was quite successful in removing unobserved 
clouds from the background (see figure caption). 

Figures 4 and 5 show the corresponding bias PDFs 
for the cloud brightness temperature Tb and C02-slicing 
based cloud top pressure, p c . Similar conclusions can be 
drawn as from Figure 3. Note that the inset panels have a 
slightly different interpretation than for Figure 3. Namely, 
for Figure 4 the • refers to the existence of a Tb value, that 
is a “low cloud” (p c > 550 hPa), while the o represents not 
only clear pixels/subcolumns, but those which do not use 
Tb because the cloud top is too high ( p c < 550 hPa). Based 
on this interpretation, the inset illustrates that the fraction 
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Figure 1. Global plots of all-sky gridcolumn mean cloud optical thickness r (top) and in-cloud gridcolumn mean cloud brightness 
temperature T b (mid) and cloud top pressure p c (low) for the control run b7K64fhC32 for each of the triangularized background (B), 

observations (O), and analysis (A). 


of modeled low clouds not seen in the observations is 
reduced by the analysis (transfer of upper-left red to 
lower-left blue) and also that the fraction of observed 
low clouds that are not modeled is also reduced by the 
analysis (transfer of lower-right red to upper-right blue). 
Analogous comments apply for Figure 5. 

In view of the clear difference between day and night 
assimilation [see item (6) above] we also computed the 
global bias statistics for the control run separately for 
day and night gridcolumns. Table I shows the results. 
Note the following: (1) the background biases O-B are 
significantly smaller in the mean during the day than 
during the night, but very similar in standard deviation; 
(2) the correction of mean bias in TJ, and p c by the cloud 
analysis is significantly stronger during the night than 
during the day, and the same also applies for the standard 
deviation in the bias, though to a much lesser degree. In 
view of item (6) above, it would seem that the daytime 
analysis, with its additional constraint on the r observable, 
achieves a reduced alignment to the Tb and p c observables. 


Table I. Summary of the global biases for the control run 
b7K64fhC32, and for the control run with day and night subset- 
ting. The format of each box is O-B -A O-A, where the mean bias is 
reported in the upper section, and the standard deviation in the bias 
in the lower section. No bias for r column is reported, since r is 
only available during the day. 


mean 

T b 

Pc 

b7K64fhC32 

1.91 -A 0.71 

37.93 -A 8.76 

day only 

1.48 -A 0.96 

32.17 -A 14.39 

night only 

2.21 -A 0.54 

41.54 -a 5.25 

sdev 

T b 

Pc 

b7K64fhC32 

6.09 -A 3.59 

95.41 -A 58.19 

day only 

6.04 -A 4.13 

97.12 -A 62.43 

night only 

6.12 -A 3.16 

94.14 -A 55.10 


and that addition of the nighttime prior constraint on 
CWP / TWP, although having a similar effect, is not as 
dominant. 
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Figure 2. Global plots of O-B and O-A biases for the control run b7K64fhC32 for the observables in Figure 1. Clearly the analysis biases 

are much smaller than the background biases. 


3 Sensitivity Tests 

Table II lists the global mean biases in r, 2], and p c , and 
the standard devation (sdev) in the biases, for the con- 
trol run b7K64fhC32 and for several sensitivity exper- 
iments varying key algorithmic parameters. The exper- 
iment b7K64fhC16 reduced the proposal covariance by 
half with C = 16 and produces a slightly worse mean 
and sdev O-A than the control. Tests with C = 64 (not 
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shown) produced mixed results when compared with the 
control, but no significant improvement. The experiment 
b7K64flC32 uses twice as many trials per MTM chain 
point as the control, namely /m = 1- We expected this to 
be better, since using the same number of trial points as 
there are effectively independent dimensions in the par- 
ameter space seems reasonable. Indeed the results are 
slightly better, but certainly not enough so to justify twice 
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Figure 3. PDFs of the biases O-B (red) and O-A (blue) in all-sky gridcolumn mean cloud optical thickness for control run b7K64fhC32. 
Biases are shown for six regions, as noted, where the latitude boundaries between tropics, mid- and high-latitudes are at 22.5 and 50 
degrees. The peak of the O-A PDFs are all significantly narrower than the corresponding O-B PDFs, indicating the basic success of cloud 
assimilation method. Both the modal biases (location of the peak) and the mean biases (dashed vertical lines) are also generally smaller in 
magnitude for O-A than O-B. The mean and standard deviation of the biases are also explicity printed in the upper left of each panel. Note 
that all these results are for the subset of gridcolumns that are non-clear (either partially cloudy or overcast) for both the observations (in 
the gridcolumn) and the model (B or A). This subset is also represented in the •• quadrant of the upper right inset of each panel. This inset 
shows the four-way fractional split between clear (o) and non-clear (•) cases for the observations O and the model M (B or A). The fact 
that there is a transfer of probability from cloudy B to clear A when the observations are clear (i.e., red in upper left to blue in lower left 
quadrants) is also indicative of the success of the cloud assimilation method. 


Table II. Summary of the variation in global biases among the different algorithmic sensitivity experiments. The format of each box is 
O-B — » O-A, where the mean bias is reported in the upper section, and the standard deviation in the bias in the lower section. 


mean 

T 



T h 


Pc 

b7K64fhC32 

-4.82 -a 

-2.18 

1.91 

— > 

0.71 

37.93 -A 8.76 

b7K64fhC16 

-4.82 -a 

-2.17 

1.90 

— ► 

0.79 

37.96 -A 9.92 

b7K64flC32 

-4.83 -A 

-1.87 

1.91 

— > 

0.66 

37.91 -A 6.89 

b7K128fhC32 

-4.78 -A 

-1.99 

1.96 

— ► 

0.67 

37.49 -A 6.11 

b7K64fhC32_n400 

-4.82 -a 

-1.86 

1.91 

— > 

0.64 

37.92 -A 6.80 

b7G64fhC32 

-4.82 -a 

-1.94 

1.91 

— ► 

0.62 

37.95 -A 10.41 

sdev 

T 



T b 


Pc 

b7K64fhC32 

15.50 -A 

7.02 

6.09 

— > 

3.59 

95.41 -a 58.19 

b7K64fhC16 

15.51 -A 

7.23 

6.09 

— ► 

3.81 

95.38 -A 61.15 

b7K64flC32 

15.51 -A 

6.64 

6.09 

— > 

3.49 

95.36 -A 56.92 

b7K128fhC32 

15.40 -A 

6.44 

6.08 

— ► 

3.39 

95.27 -A 55.87 

b7K64fhC32_n400 

15.50 -A 

6.47 

6.09 

— > 

3.41 

95.38 -A 54.93 

b7G64fhC32 

15.50 -A 

6.66 

6.09 

— ► 

3.49 

95.36 -A 54.21 


the expense of this experiment (doubling the number of 
trials approximately doubles the experiment timing). The 
experiment b7K128fhC32 doubles AW (and N, mm ) to 
128. On the whole, it is slightly better, but again, not 
enough so to justify twice the expense. The experiment 
b7K64fhC32 _n400 doubles the length of the MTM chain 
to 400 elements. It is also slightly better than the control. 


but yet again not worth twice the runtime expense. All the 
above experiments involve algorithmic perturbations, and 
we conclude that the control b7K64fhC32 is satisfactory 
and cost effective. 

One final algorithmic sensitivity experiment, 
b7G64fhC32, replaced the kernel density estimate (KDE) 
method of likelihood evaluation, described in Part I, 
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Figure 4. As in Figure 3 but for brightness temperature Tj, and including nighttime as well as daytime gridcolumns. See the text for 
important comments on the inset, which has a different interpretation than for Figure 3. 


0.025 


0.020 


Global 


mid-lats 


NH-high-lats 



200 


o.ooo fe 

-200 -100 


-200 -100 


0.040 
0.035 
0.030 
0.025 
0.020 
0.015- 
0.010 - 
0.005 - 

oF 00 - 


200 -100 


200 


O-B(r), O-A(b) in CTP [hPa] 


O-B(r), O-A(b) in CTP [hPa] 


O-B(r), O-A(b) in CTP [hPa] 


Figure 5. As in Figure 4 but for C02-slicing based cloud top pressure. See the text for important comments on the inset. 


Section 2.8, with a simple 2D Gaussian likelihood 
evaluation, also discussed in that section. While this 
Gaussian likelihood forces the likelihood functions 
P» Po ((hiT,p c )|a) and ^.^((lnr, Ti,)|a) to be unimodal 
and of Gaussian form (e.g., with elliptic contours), the 
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results from Table II are actually a small improvement 
on the control in all measures except global p c bias. 
Although we retain the KDE likelihood in the control, 
for its generality and all-round good performance, we 
intend to continue to investigate the Gaussian likelihood, 
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because the simplicity of the Gaussian may allow some 
simplification and speed-up of the code (and perhaps 
a simpler integration of our CDA method with more 
traditional 3D- and 4D-Var approaches.) That being said, 
the Gaussian likelihood version is not currently faster 
than the KDE version because the statistical subcolumn 
generation dominates the evaluation of the likelhood at 
the observations. This might be reversed with further 
coding improvements. [We also plan to try a Gaussian 
Copula (GCOP) likelihood, a middle ground between the 
KDE and Gaussian likelihoods, as mentioned in Part I. 
This is distinct from the GCOP model of layer overlap 
already included in this study.] 

Table III examines the sensitivity of global 
biases to more physical parameters. The experiment 
b7K64fhC32X200 doubles the vertical correlation length 
scale L to 200 hPa. This is somewhat worse than the 
control. It is technically possible to include L in the 
parameter list for Bayesian inference, but we have not yet 
tried this. The experiment b7K64lhC32 pTrop replaces 
the fixed p rdmp = 100 hPa of Part I, Section 2.5, with a 
variable tropopause pressure, as output by the background 
GEOS-5 simulation, and replaces the fixed pii m = 50hPa 
with p ramp — 50 hPa. [Note: p ramp is the lowest pressure 
at which full parameter variability is permitted. Above 
this level, the parameter variability decays linearly with 
pressure, to zero at pn m . This ramp effects both the S 
prior of Part I, Section 2.5, and the MTM estimated 
target covariance, T n , discussed near the end of Part 
I, Section 2.9]. The results for b7K64lhC32 pTrop are 
very similar to the control run, with the exception that 
the mean cloud top pressure bias is significantly reduced 
(from a 77% reduction O-B — » O-A to an 83% reduction). 
This improvement makes sense: we are constraining the 
analyzed cloud top pressure to approximately below the 
background tropopause, rather than a nominal level of 
100 hPa in the control. In retrospect, this modification 
should have been included in the control, and will be for 
future work, but not for the current paper, except where 
noted. 

The experiment b7K64fhC32JnP uses In p rather 
than p in matters relating to the specification of vertical 
correlation. This has some justification, since lnp is 
a proxy for the height above the surface, z, and it is 
more common to parameterize vertical correlation in 
terms of z than p. The implementation is as follows: 
in the evaluation of the vertical correlation matrix C in 
Part I, equation (5), £kk' = \pk ~Pk'\/L is replaced by 
(,kk' = | In pk- lnpfc'l/Lin, where L ln is a “length scale” 
in lnp, given by L\ n = ln[(p re f + Ap/2)/(p ref - Ap/2)], 
such that p re f + Ap/2 and p re f — Ap/2 are separated 
by one length scale in lnp, where p K f is some ref- 
erence pressure, set to 500 hPa for this experiment, 
and Ap=100hPa, akin to the L = 100 hPa of the 
control. This means that 450 and 550 hPa have the 
same decorrelation in both the control b7K64fhC32 
and the b7K64fhC32JnP experiment, but that other 
levels separated by 100 hPa now have different decor- 
relations from the control. In particular, upper levels 


separated by 100 hPa are relatively less correlated, since 
A p higher in the atmosphere represents a relatively 
greater altitude difference, than the same A p near the 
surface. In addition, the linear ramp in p between p ramp 
and Piim, discussed above, for near tropopause decay 
of parameter variability, now becomes linear in lnp. 
Lastly, in the calculation of the number of trials M 
per point in the MTM chain (end of Part I, Section 
2.9), {po~Pnm)/L is replaced by ln(p 0 /piim)/^in- The 
performance of the b7K64fhC32JnP experiment is at 
best mixed, with small improvements for p c and degra- 
dations for T h . The InP experiment is also about 60% 
more expensive than the control, due to the increase 
in the number of trials M per MTM chain point, as 
described above [\n(p 0 / p hm ) / L ln -t- (p 0 - pum)/L = 

ln(1000/50) / ln(550/450) -7 950/100 « 15 -E 9.5 « 1.6]. 
Using a p re f lower in the atmosphere exacerbates this 
problem. The experiment b7K64fhC32JnPtr is a com- 
bination of b7K64fhC32_pTrop and b7K64fhC32JnP. 
The pTrop modification generally lowers pu m within 
the atmosphere and therefore decreases M, resulting 
in a small overall decrease in computation time for 
b7K64fhC32JnPtr relative to the control. However the 
results from Table 111 are not, on balance, better than the 
InP experiment, with a slight improvement in p c being 
offset by a slight deterioration in COT. 

Finally, the experiment b7K64fhC32_RiiSl imple- 
ments a version of the simple flow-dependent correlation 
function proposed by Riishojgaard (1998). The idea is 
to replace the “pressure distance” based vertical correla- 
tion matrix C of Part I, equation (5), by the Hadamard 
product of itself and another correlation matrix D with 
elements D ^ = u(\9(a\) — 0(a£,)|), where u is a cor- 
relation function on R and 9 is a function of the back- 
ground state vector a b for a layer. In this study we use the 
Gaussian correlation function i/(r-a) = exp[— r 2 /(2cr 2 )]. 
Riishojgaard suggests the use of some conservative func- 
tion for 9, on the basis that Lagrangian transport pre- 
serves conservative properties. In effect, the D matrix 
would therefore decorrelate regions which have different 
Lagrangian origins. The use of a conservative 9 is most 
important for horizontal correlations, since advection- 
based variability dominates in horizontal. In the vertical, 
other processes such as turbulent diffusion and radiatiave 
heating become significant. For our case of vertical corre- 
lations, we could use the conservative total water content 
q t , but we can just as well use the control variable S, the 
mean total saturation ratio, which has a smaller dynamic 
range but still captures well the sharp humidity inversion 
at the top of the planetary boundary layer, for example. 
This is really our goal — to use the D term to provide 
strong decorrelation across inversion features in the ver- 
tical, which are otherwise not adequately decorrelated by 
the fixed pressure scale L of the control C model. Thus the 
experiment b7K64fhC32_RiiSl applies the above D mod- 
ification, with 9(a b k ) = S b and a standard deviation a = 
ksc r§, where erg = 0.1 is the prescribed standard devia- 
tion of S in the control (see Part I, Section 2.5 and 2.9) and 
Kg is a constant, set to one for this experiment. Looking 
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Table III. Summary of the variation in global biases among the sensitivity experiments examining physical parameters. The format is the 

same as in Table II. 


mean 




T b 


Pc 

b7K64fhC32 

-4.82 -a 

-2.18 

1.91 

— > 

0.71 

37.93 -A 8.76 

b7K64fhC32_L200 

-4.40 -a 

-3.61 

1.79 

— ► 

1.15 

40.95 -a 17.40 

b7K64fhC32_pTrop 

-4.81 -A 

-2.16 

1.91 

— > 

0.66 

37.90 -A 6.26 

b7K64fhC32_lnP 

-4.64 -a 

-2.21 

1.99 

— ► 

1.05 

37.81 -A 6.66 

b7K64fhC32_tnPtr 

-4.62 -a 

-2.34 

1.99 

— > 

1.04 

37.84 -A 5.31 

b7K64fhC32_RiiS 1 

-5.12 -a 

-0.66 

2.09 

— ► 

0.35 

35.15 -A 3.05 

sdev 

T 



T b 


Pc 

b7K64fhC32 

15.50 -A 

7.02 

6.09 

— ► 

3.59 

95.41 -a 58.19 

b7K64fhC32_L200 

15.26 -A 

9.07 

6.17 

— ► 

4.29 

95.13 -A 67.50 

b7K64fhC32_pTrop 

15.49 -A 

7.12 

6.09 

— ► 

3.55 

95.37 -A 56.94 

b7K64fhC32_lnP 

15.40 -A 

7.02 

6.14 

— ► 

3.80 

95.21 -a 55.61 

b7 K64fhC3 2 JnPtr 

15.39 -A 

7.29 

6.14 

— > 

3.82 

95.21 -a 54.95 

b7K64fhC32_RiiS 1 

15.70 -A 

6.22 

6.05 

— ► 

3.23 

95.68 -A 53.65 


at the results from Table III, we see that this experiment 
is superior to the control in all measures. We believe that 
using the D matrix is better honoring the basic vertical 
structure of the atmosphere, as represented in the back- 
ground state, and to the extent that this background state 
is able to capture realistic vertical stmcture, is producing 
an analyzed state more consistent with the observations. 
In retrospect, this modification should have been included 
in the control, and will be for future work, but not for the 
current paper, except where noted. 

4 California Stratocumulus Study 

Figure 6 shows a case study off the west coast of North 
America in which marine stratocumulus is present right up 
to the coast in Southern California and Baja California, but 
is absent in a wide swath off this coast in the background. 
In such cases it is typically very difficult for a cloud 
data assimilation system to restore equilibrium cloud to 
such regions because the background, subsaturated clear 
regions cannot produce cloudiness for any small equi- 
librium perturbation to the moisture field. However, the 
Monte Carlo Bayesian system is able to restore the stra- 
tocumulus, since it makes finite jumps in parameter space. 
The cloud brightness temperature Tj, in the restored region 
is in good agreement with the observations, although the 
cloud optical thickness appears a little too high. Note that 
these results are for the experiment b7K64lhC32 RiiSl, 
that is, the control with the Riishojgaard (1998) correla- 
tion modification. While the pure control was almost as 
good, and certainly also restored the near coast stratocu- 
mulus, it produced a more noisy analysis with even higher 
COT in the restored region. It seems that the Riishojgaard 
modification is acting, as anticipated, to honor the back- 
ground moisture inversion structure, thereby limiting the 
ability of the analysis to produce excessive cloud thick- 
ness by artificially raising the cloud top. 

Note that our earlier experiments with this marine 
stratocumulus case were a failure. In those experiments 
we were using r and p c only as observables, not T&. The 
failure occurred because the MODIS retieved CO 2 -slicing 
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cloud top pressure is unrealiable for low clouds (below 
550 hPa), and was producing cloud top pressures in the 
range 650-750 hPa, significantly above the marine inver- 
sion, which is below 900 hPa. The anatomy of the fail- 
ure was as follows: the assimilation system was trying 
to produce a cloud with a top around 700 hPa (based on 
the erroneous MODIS p c ), but the background relative 
humidity at this height, well above the marine inversion, 
was only about 10%. Clearly the prior in S prevented the 
system from shifting the mean too far from 0.1, so the 
system instead produced a very positively skewed mois- 
ture PDF, with the upper tail just crossing saturation. The 
resulting small cloud fraction (less than 10%) near 700 hPa 
was able to raise the simulated cloud top pressure towards 
the incorrect MODIS value, but at the expense of chang- 
ing the observed cloud fraction of close to 100% to less 
than 10% ! This illustrates the potential problems that can 
arise when assimilating erroneous data, or alternatively, 
in using a forward model that does not well simulate the 
actual observable. In practise, had we made cloud frac- 
tion an additional observable with a prior constraint, then 
the outcome would have been different. In that case, the 
highly skewed, low cloud fraction solution would have 
been impossible, and the analysis would have remained 
near the background, which would have been preferrable. 
But as it was, we chose a better solution: we dropped the 
p c observable in favor of Tj, below 550 hPa. 

5 Comparison between MODIS and SEVIRI assimi- 
lation 

We ran an assimilation with a short 15 minute window 
centered on 13:52:30 GMT of July 1,2011, to compare the 
assimilation of MODIS Aqua cloud retrievals (as above) 
with assimilation of cloud retrievals made by the NASA 
Langley Cloud and Radiation Group from the Spinning 
Enhanced Visible and Infrared Imager (SEVIRI) instru- 
ment aboard the Meteosat-9 geostationary platform (see 
Minnis et al., 2006). [The above time is near the Aqua 
MODIS viewing of 0°N, 0°E, the sub-satellite point of 
Meteosat-9.] The Meteosat-9 SEVIRI retrievals have the 
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Figure 6. A case study of marine stratocumulus off the coast of California and Baja California. The upper panels show brightness 
temperature Tt> and the lower panels all-sky cloud optical thickness (r). The background (left) is absent the stratocumulus near the Southern 
or Baja Californian coast that is present in the observations (middle). The analysis (right) is able to restore this stratocumulus, albeit with 
a slightly higher COT than observed. The somewhat noisier appearance of the analysis comes from the fact that each gridcolumn analysis 

is currently independent of its neighbors. 


advantage of large spatial coverage (including the whole 
of Europe, the Atlantic Ocean, and all of Africa) at high 
temporal resolution (every 15 minutes), compared with 
the twice-a-day overpasses of the Aqua and Terra polar- 
orbiting platforms. The nadir field of view is also reason- 
ably small at 3 km, though not as small as the MODIS 
1-km optical retrievals. From an assimilation point of 
view, the main difference is that only cloud optical thick- 
ness r and brightness temperature Tb are assimilated from 
the SEVIRI retrievals, not C02-slicing cloud top pressure 
p c as for MODIS. Thus, while the MODIS assimilation 
switches between Tj and p c for cloudy pixels, depend- 
ing on the value of p c , as described earlier and in Part 
I, the SEVIRI assimilation always uses Tb when cloud is 
present. 

Comparing Figures 7 and 8, it seems that assimila- 
tion of the two retrieval datasets is comparable (though 
the SEVIRI dataset has the obvious advantage of being 
available as a full-disk image every 15 minutes.) An 
examination of the corresponding O-A PDFs (not shown) 
confirms this comparability, although the ability of the 
assimilation algorithm to remove unobserved cloud from 
the background appears to be a little better for MODIS 
than SEVIRI assimilation. [Both experiments use the 
b7K64fhC32 control with the tropopause (pTrop) and 
Riishojgaard (RiiSl) modifications discussed above.] 


6 Validation using OMI 

The above results provide substantial evidence that the 
Bayesian cloud data assimilation system is at least doing 
what it was designed to do: to drive the subcolumn 
model to a new state more consistent with the obser- 
vations. We also have evidence, from the Stratocumulus 
case study, that the CDA system is also able to handle 
particularly tricky cases where there is no linear sensitiv- 
ity of the observations to the model. But we also want 
a broader global validation of the improvement of the 
analyzed state, as compared with some independent, non- 
assimilated dataset. For this we turn to the so-called Opti- 
cal Centroid Pressure (OCP) retrieved from a rotational- 
Ramman scattering (RRS) algorithm (see Joiner et al., 
2012, hereafter J12) applied to UV spectral measurements 
made by the Ozone Monitoring Instrument (OMI) aboard 
the polar-orbiting Aura satellite (at the rear of the so- 
called “A-Train”, 15 minutes behind Aqua). 

The OCP is a measure of how far into a cloud 
one can see in the UV, which is much further than 
at IR wavelengths. Therefore, the OCP is much deeper 
inside a cloud than the IR-window or C02-slicing derived 
cloud-top pressure (CTP), which is typically very near 
the cloud top for all but very thin clouds. As such, the 
OCP is senstive to both the CTP and the cloud optical 
thickness (COT). That being said, we must realistically 
acknowledge that the Bayesian CDA system descibed 
in this paper (Part I) does not put a strong constraint 
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Figure 7. Cloud optical thickness for an assimilation of MODIS Aqua cloud data for a 15 minute window centered on 13:52:30 GMT of 
July 1, 201 1 . These results should be compared with the SEVIRI assimilation results in Figure 8. 
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Figure 8. As in Figure 7, but for assimilation of cloud retrievals made by the NASA Langley Cloud and Radiation Group for Meteosat-9 

SEVIRI data (see text.) 


on the vertical structure of the cloud, but mainly on 
the cloud top and the overall optical thickness. This 
limitation is particularly true for multi-layer clouds. Thus 
we cannot necessarily expect to get a major improvement 
in OCP simulations from the analyzed cloud state, but 
some improvement is hoped for. 

For the observations we use the OMI OMCLDRR 
product, which contains both the effective cloud fraction 
/ e ff (“CloudFractionfor03”) and optical centroid pressure 
Pqcp (“CloudPressurefor03”) Scientific Datasets (SDS). 
The nadir FOV is approximately 12 km along track and 
24 km across track. We only include FOVs satisfying 
the following conditions, based on guidance from J12 
and the OMCLDRR README file provided with the 
data: (1) the OCP is valid if bits 0, 1, 2, 3, 4, 6, 7, 
13, 14, and 15 of the “ProcessingQualityFlagsfor03” 
SDS are all zero, and if the Focp value itself has some 
positive value; (2) f e g is valid if Focp is, and we further 
require / e g > 0.3, since the algorithm is very noisy for 
small effective cloud fractions; (3) we also exclude FOVs 
flagged as snow/ice by bit 5 of the above processing 
quality flag (pqf), since for these surface types f e g is 
set to one and so Pocp is approximate; (4) we also 
exclude FOVs with radiance or irradiance errors according 
to bits 9 and 11 of the pqf; (5) finally, we exclude 
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FOVs possibly compromised by sea-glint under low cloud 
fraction conditions (namely / e g < 0.3 and bit 4 of the 
‘GroundPixelQuality Flags” SDS set). All FOVs satisfying 
these conditions are binned into the reduced longitude 
grid of the analysis and the mean (in-cloud) .Pocp is 
calculated for each accepted gridcolumn. In addition, only 
those gridcolumns for which the corresponding observed 
gridcolumn mean all-sky COT from Aqua is at least 5 
are included in the analysis, since the retrieved OCP is 
believed to be less reliable for optically thinner clouds (see 
J12). 

We study the day February 13, 2007 and use a 
modified CDA control run that only assimilates Aqua (not 
Terra) observations, since OMI is onboard Aura which 
closely follows behind Aqua. A Pocp is simulated for 
the triangularized background (B) and analyzed (A) states 
using the fast “R 3 S” simulator of J12. Namely Pqcp 
is evaluated according to equation (5) of J12: Pocp = 
S; PiPi/ Xw Pi’ where Pi is the midpoint model layer 
pressure and pi is the contribution to the cumulative UV 
cloud reflectance from layer l, as described in J12 (using 
the adding-doubling method and a conservative delta- 
Eddington two-stream calculation of layer reflectances 
and transmittances, with assymmetry factor g = 0.8.) 
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Figure 9. PDFs of the biases O-B (red) and O-A (blue) of in-cloud mean gridcolumn cloud optical centroid pressure Pocp for a modified 
control run that assimilates only Aqua (not Terra) observations. Biases are shown for six regions, as noted, where the latitude boundaries 

between tropics, mid- and high-latitudes are at 22.5 and 50 degrees. 


Figure 9 shows Pocp bias PDFs for O-B and O-A, 
globally and for different regions. The biases are in the 
in-cloud mean Pocp f° r eac h gridcolumn in the specified 
region and for each of the eight assimilation time windows 
in the 24-hour period. Global mean biases are reduced 
by about 25% and the standard deviation in the biases by 
about 21%. Biases also appear to be more Gaussian after 
assimilation, which is helpful given that these analyzed 
states will eventually become the background for a more 
traditional 3D- or 4D-variational data assimilation. 

It is somewhat dissapointing that the biases were 
not more strongly reduced, but as stated earlier, the CDA 
method does not strongly constrain cloud vertical struc- 
ture, and also, the fast R 3 S OCP simulator only provides 
an approximation to the retieved OCP. To investigate the 
potential role of cloud vertical structure, we also studied 
the global mean bias and standard deviation reductions as 
a function of COT, r. For moderately thin clouds (5 < 
t < 10) the global mean bias was reduced by 33% and the 
standard deviation by 24%. For clouds with 10 < r < 25, 
these reductions are 27% and 23%, and for r > 25, only 
about 12% and 9%. Thus, the reductions do seem to dete- 
riotate with increasing cloud thickness, and especially for 
thick clouds (r > 25), which is consistent with the idea 
that cloud vertical structure will be more important for 
thick clouds. That being said, the addition of the “pTrop” 
modification produced no significant change in the OCP 
bias reductions, and the Riishojgaard modification pro- 
duced a small degradation, so it is clear that the issue of 


constraint of vertical cloud structure is not yet well under- 
stood and needs further study. It should also be noted that 
the Riishojgaard modification still uses the same number 
of MTM trials M per chain point as the control, calcu- 
lated based on the standard pressure scale length method 
of Part I Section 2.9. This should really be modified when 
using the Riishojgaard correlation modification, but it is 
not immediately clear to us how to do that. 


7 Discussion and Conclusions 

Part I of this paper presented a new Monte Carlo Bayesian 
method for constraining a complex statistical model of 
GCM sub-gridcolumn moisture variability using high- 
resolution MODIS cloud data. The method has strong 
application possibilities in cloud data assimilation and 
cloud parameterization testing and development, since it 
can be used to evaluate and improve background (prior) 
estimates of the state of a model gridcolumn and its 
sub-grid parameterization parameters. We have chosen 
one particular gridcolumn statistical model for this study, 
namely skewed triangle PDFs of intra-layer moisture and 
a Gaussian copula to couple them in the vertical, but many 
other more or less complicated models can be conceived 
and intercompared, using this method, for fidelity in 
simulating observed cloud data. 

Part II has performed some basic testing of the 
new method, verifying that it does indeed very signif- 
icantly reduce mean and standard deviation biases with 
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respect to assimilated MODIS cloud optical depth, bright- 
ness temperature and cloud top pressure. Of particu- 
lar- interest, a case study of marine stratocumulus off 
the Californian coast has demonstrated that the Monte 
Carlo method performs well in the especially difficult 
case where the background state is clear but cloudy 
observations exist. In traditional linearized data assimila- 
tion methods, a subsaturated background cannot produce 
clouds via any infinitesimal equilibrium perturbation, but 
the Monte Carlo approach allows finite jumps into regions 
of non-zero cloud probability. In the example provided, 
the method was able to restore marine stratocumulus near 
the Californian coast where the background state had a 
large clear swath. 

One important application of the method is assim- 
ilation of cloud data into large-scale numerical weather 
prediction models. By showing that the method also eas- 
ily assimilates geostationary SEVIRI retrievals, which are 
available every 15 minutes with huge spatial coverage, we 
have opened up the possibility of future significant assim- 
ilation of such retrievals to constrain the global moisture 
field, of which clouds are a significant marker. 

One obvious difficulty for the new method (and 
for other CDA methods as well) is the limited infor- 
mation content on cloud vertical structure in the cho- 
sen cloud observables, cloud top pressure (or bright- 
ness temperature) and column optical depth. Yet even 
these variables do produce a significant improvement in 
the simulated rotational-Ramman scattering cloud opti- 
cal centroid pressure (OCP) against independent (non- 
assimilated) retrievals from the OMI instrument. Because 
one can see deeper into a cloud in the UV, the OCP is sen- 
sitive not only to the cloud top pressure, but also to cloud 
vertical stmcture. Clearly the fidelity of the model’s back- 
ground vertical structure, including boundary layer inver- 
sions, will be important for this problem, and it is impor- 
tant to honor such features in the assimilation process. 
It was found that a simple flow-dependent correlation 
function due to Riishojgaard (1998) provides some help 
in this respect, since it permits stronger decorrelations 
across moisture (or other) inversions in the background 
state. To further improve the analyzed vertical structure of 
the cloud field, future work can consider the addition of 
new observables, with stronger vertical sensitivity, to the 
Bayesian problem: OCP is one such observable, but other 
multispectral passive observations, or active sensors, can 
be considered as well. 


in the SEVIRI comparison and guidance on their use; 
and Ioanna loiner for providing the rotational-Ramman 
scattering OCP fast-simulator code and guidance on the 
use of the OMI OMCLDRR dataset. 

References 

Joiner J, Vasilkov AP, Gupta P, Bhartia PK, Veefkind P, 
Sneep M, de Haan J, Polonsky I, Spurr R. 2012. Fast 
simulators for satellite cloud optical centroid pressure 
retrievals; evaluation of OMI cloud retrievals. Atmos. 
Meas. Tech. 5: 529-545. doi:10.5194/amt-5-529-2012. 

Minnis P, Nguyen L, Smith WL, Palikonda R, Doelling 
DR, Ayers J, Trepte Q, Chang F. 2006. MSG SEVIRI 
applications for weather and climate: Cloud properties 
and calibrations. In Proc. 3rd MSG RAO Workshop, 
Helsinki, Finland, June 15, 2006. pp. 25-30. 

Riishojgaard LP. 1998. A direct way of specifying flow- 
dependent background error correlations for meteoro- 
logical analysis systems. Tellus 50A: 42-57. 


Acknowledgements 

This work was supported by a NASA grant from the 
Modeling, Analyis, and Prediction program (solicitation 
NNH08ZDA001N-MAP, proposal title: “Assimilation of 
A-Train satellite data for constraining a new PDF-based 
cloud parameterization in GEOS-5”, PI: Arlindo da Silva). 
The authors wish to thank Steven Platnick and Gala 
Wind for providing the MODIS cloud data and much 
useful insight into its properties and use; Pat Minnis and 
Rabindra Palikonda for the SEVIRI retrieval files used 


Copyright © 2013 Royal Meteorological Society 
Prepared using qjrms3.cls 


Q. J. R. Meteorol. Soc. 00 : 1—12 (2013) 
DOI: 10. 1002/qj 


